Tutorial for Data Exploration Tool - Lantern Part 1

Overview

Lantern is a Python toolkit for data exploration, covering everything from sample datasets to visualization.

In this post, I will walk through the following:

  • How to set up Lantern
  • What Lantern can do
    • dataset
    • plot (visualization)
    • grid (interactive table view)
    • widget

How to set up Lantern

In [90]:
# Uncomment and run these once to install Lantern
# !pip install pylantern
# !jupyter labextension install pylantern  # only needed for JupyterLab
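
After installing, it can help to confirm that the package is importable before running the rest of the notebook. The following is a minimal sketch using only the standard library. The helper name `is_installed` is made up for this example, and the module name `lantern` is taken from the import used later in this post.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# "lantern" is the import name used in this post; the pip package is "pylantern".
if is_installed("lantern"):
    print("lantern is available")
else:
    print("lantern not found; try: pip install pylantern")
```

The check only looks the module up and does not import it, so it stays quick even for large packages.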

Dataset

The available datasets (as of March 2019) are as follows:

Dummy data from Mimesis

  • person
  • people (multiple records of person)
  • company
  • companies (multiple records of company)
  • ticker
  • currency
  • trade
  • superstore

Simple test data for visualization

  • line
  • bar
  • scatter
In [21]:
import lantern as l
import matplotlib.pyplot as plt
%matplotlib inline

Person

In [4]:
# a single person record from Mimesis, a fake data generator
l.person()
Out[4]:
{'first_name': 'Jaleesa',
 'last_name': 'Nicholson',
 'name': 'Jaleesa Nicholson',
 'age': 66,
 'gender': 'Female',
 'id': '29-33/65',
 'occupation': 'Hospital Orderly',
 'telephone': '269.084.6405',
 'title': 'Miss',
 'username': 'crossways-1817',
 'university': 'Bridgewater State University'}
In [33]:
# multiple records of person with locale
l.people(count=5, locale='en')
Out[33]:
  | age | first_name | gender | id | last_name | name | occupation | telephone | title | university | username
0 | 57 | Rasheeda | Female | 47-77/04 | Burks | Rasheeda Burks | Book-Keeper | (480) 823-9546 | M.Eng. | Middle Georgia State University | uninfluencing.1910
1 | 28 | Marin | Female | 53-75/38 | Cain | Marin Cain | Negotiator | (324) 675-5533 | Ms. | University of Massachusetts Lowell (UMass Lowell) | costumey1871
2 | 28 | Nery | Female | 03-95/02 | Moran | Nery Moran | Doctor | (961) 775-0708 | M.Des | California State University, Dominguez Hills (... | lizard.2046
3 | 27 | Quincy | Male | 60-41/55 | Sharp | Quincy Sharp | Rally Driver | +1-(508)-040-9027 | MMath | Savannah State University | deermeat-1860
4 | 42 | Weston | Male | 78-67/26 | Glass | Weston Glass | Clerk | 385.314.2775 | Mr. | University of California, Santa Cruz (UC Santa... | Mazard1862
In [34]:
# Visualize people
people = l.people(count=50, locale='en')
people['gender'].value_counts().plot(kind='bar');
In [35]:
people['age'].hist();
In [36]:
people['occupation'].value_counts().plot(kind='bar');
In [37]:
people['university'].value_counts().plot(kind='bar');
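
The bar charts above are all built on `value_counts()`, which tallies how often each value occurs and sorts the result by frequency. As a rough illustration of that idea (not of how pandas implements it), here is a standard-library sketch using the gender column from the five-row sample output above. The helper name `value_counts` is reused here only for readability.

```python
from collections import Counter

# Gender values copied from the five-row sample output above.
genders = ["Female", "Female", "Female", "Male", "Male"]


def value_counts(values):
    """Tally values and return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()


gender_counts = value_counts(genders)
print(gender_counts)  # [('Female', 3), ('Male', 2)]
```

Each pair corresponds to one bar in the chart: the value is the label and the count is the bar height.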

Company

In [13]:
# company
l.company()
Out[13]:
{'name': 'Davis-Nelson',
 'address': '56119 Aaron Street\nEast Robert, AR 89379',
 'ticker': 'XFNS',
 'last_price': 25.235810227983535,
 'market_cap': 73150866464,
 'exchange': 'CJ',
 'ceo': 'Taylor Vaughan',
 'sector': 'Health Care',
 'industry': 'Biotechnology'}
In [17]:
# Multiple companies
l.companies(count=5)
Out[17]:
  | address | ceo | exchange | industry | last_price | market_cap | name | sector | ticker
0 | 429 Fisher Flat Apt. 416\nMurphyburgh, PA 67479 | Christopher Graves | C | Food Products | 55.544594 | 76049625384 | Sullivan-Rios | Consumer Staples | ZIXP
1 | 335 Valentine Islands\nBrownfort, NY 85152 | Natasha Campbell DVM | N | Diversified Consumer Services | 2.265270 | 80741988205 | Andrews and Sons | Consumer Discretionary | YRNP
2 | 70911 Jessica Villages\nComptonland, NY 73451 | Jeffrey Hamilton | N | Industrial Conglomerates | 31.767192 | 65460412214 | Lucero-Turner | Industrials | EFZC
3 | 556 Gordon Mountains\nRodriguezborough, WI 79289 | Anita Hernandez | D | Beverages | 66.738549 | 26427166124 | Carrillo and Sons | Consumer Staples | QKT
4 | 5195 Devin Bypass\nWest Kaylee, DE 43590 | Lori Webb | D | Media | 75.953462 | 63822924249 | Lewis-Hardin | Consumer Discretionary | XKF
In [39]:
# Visualize companies
companies = l.companies(count=50)
companies.columns.values
Out[39]:
array(['address', 'ceo', 'exchange', 'industry', 'last_price',
       'market_cap', 'name', 'sector', 'ticker'], dtype=object)
In [41]:
companies['exchange'].value_counts().plot(kind='bar');
In [44]:
companies['industry'].value_counts().plot(kind='bar');
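
The `address` field in the sample output packs the street and the city, state, and postal code into one string separated by a newline. If you want to chart companies by state, you would need to split it first. Below is a small sketch. It assumes every address follows the "street, newline, city, comma, state, postal code" layout seen in the samples above, which may not hold for all generated records. The helper name `parse_address` is made up for this example.

```python
def parse_address(address):
    """Split a two-line address into street, city, state and postal code.

    Assumes the layout seen in the sample output: the street on the first
    line, then "City, ST 12345" on the second line.
    """
    street, locality = address.split("\n", 1)
    city, state_zip = locality.rsplit(", ", 1)
    state, postal_code = state_zip.split(" ", 1)
    return {
        "street": street,
        "city": city,
        "state": state,
        "postal_code": postal_code,
    }


# Address copied from the first row of the sample output above.
sample = "429 Fisher Flat Apt. 416\nMurphyburgh, PA 67479"
parsed = parse_address(sample)
print(parsed["state"])  # PA
```

With a pandas DataFrame, a helper like this could be applied to the `address` column to derive a `state` column for plotting.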

Financial

In [57]:
[l.ticker(country='us') for i in range(10)]
Out[57]:
['600610.F',
 '896852.YV',
 'NIM.RN',
 'SIH.YK',
 '052892.WR',
 'UBPV.KG',
 'WSFN.O',
 'ACXT.DG',
 'OTZJ.LH',
 'TXWK.SY']
In [56]:
[l.currency() for i in range(10)]
Out[56]:
['MNT', 'LAK', 'PGK', 'MYR', 'MKD', 'TWD', 'LKR', 'GIP', 'KWD', 'BDT']
In [59]:
l.trades(count=5)
Out[59]:
  | exchange | industry | last_price | market_cap | name | price | sector | ticker | volume
0 | I | Health Care Providers & Services | 85.551831 | 32523041272 | Davis, Sandoval and Mcdonald | 81.366530 | Health Care | WZKE | 520
1 | F | Metals & Mining | 97.309630 | 90466817603 | Mcdaniel, Howell and Ayala | 102.132817 | Materials | UPI | 490
2 | CB | Oil, Gas & Consumable Fuels | 12.421688 | 21515138511 | Benson, Williams and Hansen | 11.528355 | Energy | QITK | 150
3 | CB | Paper & Forest Products | 0.873538 | 92197354034 | Wright and Sons | -3.632086 | Materials | YEVG | 130
4 | CB | Oil, Gas & Consumable Fuels | 10.145729 | 16841436869 | Carrillo-Bennett | 6.587095 | Energy | KZJM | 160
In [63]:
# Visualization
trades = l.trades(count=50)
trades['price'].hist(bins=50);
In [66]:
trades['sector'].value_counts().plot(kind='bar');
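
Note that the dummy trade data can contain implausible values: row 3 of the sample output above has a negative `price`. Depending on what you are testing, you may want to separate such rows before plotting. The sketch below does this on plain tuples copied from the sample. The helper name `split_by_price_sign` is made up for this example.

```python
# (ticker, last_price, price) triples copied from the sample output above.
sample_trades = [
    ("WZKE", 85.551831, 81.366530),
    ("UPI", 97.309630, 102.132817),
    ("QITK", 12.421688, 11.528355),
    ("YEVG", 0.873538, -3.632086),
    ("KZJM", 10.145729, 6.587095),
]


def split_by_price_sign(trades):
    """Separate trades with a positive price from those with a price <= 0."""
    valid = [t for t in trades if t[2] > 0]
    suspect = [t for t in trades if t[2] <= 0]
    return valid, suspect


valid, suspect = split_by_price_sign(sample_trades)
print([t[0] for t in suspect])  # ['YEVG']
```

On the DataFrame itself, the equivalent filter would be a boolean mask such as `trades[trades['price'] > 0]`.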

General Purpose

In [69]:
l.superstore(count=5)
Out[69]:
  | Category | City | Country | Customer ID | Discount | Order Date | Order ID | Postal Code | Product ID | Profit | Quantity | Region | Row ID | Sales | Segment | Ship Date | Ship Mode | State | Sub-Category
0 | Consumer Staples | Stevenland | US | 7292 LZ | 95.30 | 2019-03-12 | 20-2566576 | 58744 | GCPI9739708615674 | 449.30 | 530 | Region 2 | 0 | 2700 | C | 2019-03-17 | Second Class | South Dakota | Food Products
1 | Financials | East Mary | US | DOH N55 | 85.62 | 2019-02-20 | 06-7296634 | 70184 | PNDA6797384496071 | 256.27 | 810 | Region 3 | 1 | 9300 | A | 2019-03-17 | Second Class | Nebraska | Capital Markets
2 | Financials | Anamouth | US | HBF3273 | 11.15 | 2019-03-16 | 74-1777742 | 67157 | KKRM9806243351663 | 84.00 | 150 | Region 2 | 2 | 9300 | A | 2019-03-18 | First Class | South Dakota | Banks
3 | Utilities | West Davidmouth | US | BVP-2355 | 73.75 | 2019-02-06 | 35-1306159 | 93960 | RSZA1696539299999 | 145.98 | 860 | Region 1 | 3 | 9900 | D | 2019-03-29 | First Class | Arizona | Multi-Utilities
4 | Industrials | New Josephburgh | US | Q92-27J | 83.24 | 2019-03-26 | 59-8893642 | 12242 | HAIG9080650518445 | 101.82 | 410 | Region 1 | 4 | 2000 | C | 2019-03-28 | Second Class | Maryland | Building Products
In [75]:
# Visualization
superstore=l.superstore(count=50)
superstore['Country'].value_counts().plot(kind='bar');
In [74]:
superstore['Profit'].plot(kind='hist');
In [77]:
superstore['Sales'].plot(kind='hist');
In [78]:
superstore['State'].value_counts().plot(kind='bar');
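
The superstore records carry both an `Order Date` and a `Ship Date`, so a natural derived quantity to explore is the shipping delay in days. The sketch below computes it with the standard library for the five sample rows shown above. It assumes both dates are ISO-formatted strings, as they appear in the output. The helper name `shipping_days` is made up for this example.

```python
from datetime import date

# (Order Date, Ship Date) pairs copied from the sample output above.
orders = [
    ("2019-03-12", "2019-03-17"),
    ("2019-02-20", "2019-03-17"),
    ("2019-03-16", "2019-03-18"),
    ("2019-02-06", "2019-03-29"),
    ("2019-03-26", "2019-03-28"),
]


def shipping_days(order_date, ship_date):
    """Return the number of days between two ISO-formatted date strings."""
    return (date.fromisoformat(ship_date) - date.fromisoformat(order_date)).days


delays = [shipping_days(ordered, shipped) for ordered, shipped in orders]
print(delays)  # [5, 25, 2, 51, 2]
```

A histogram of these delays would sit alongside the `Profit` and `Sales` histograms above.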

Plot

In [93]:
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode

cf.go_offline()       # use Plotly in offline mode so iplot() renders locally
init_notebook_mode()  # load plotly.js into the notebook

Area

In [94]:
l.area().head()
Out[94]:
MQW.SX RFH.RI NGG.TW HGF.EV XOK.DF
2015-01-01 0.475973 0.547789 -0.251782 0.042615 0.334737
2015-01-02 1.325388 1.506319 -1.867697 -0.498312 -0.011880
2015-01-03 3.005016 3.641948 -2.302381 0.076577 0.676959
2015-01-04 3.146639 4.725656 -2.404919 -0.294042 2.678800
2015-01-05 2.221775 4.982472 -1.289195 0.590657 0.679165
In [95]:
l.area().iplot(kind='area', fill=True)

Bar

In [96]:
l.bar().head()
Out[96]:
OOO.IJ KQO.YN LER.XS KQH.VI SBZ.PT
2015-01-01 -0.646968 0.733287 -1.436145 -1.152698 -0.342884
2015-01-02 0.987392 -0.301658 -1.139406 0.852181 -0.765065
2015-01-03 3.193570 1.274238 0.063166 -0.403930 -0.592812
2015-01-04 3.552250 0.842333 0.969290 -1.189348 1.361219
2015-01-05 2.261133 1.254262 1.173825 -1.568101 2.286603
In [97]:
l.bar().iplot(kind='bar')

Box

In [98]:
l.box().head()
Out[98]:
QOR.YP YJN.UC JAI.RR EBC.QC HFH.UI
0 7.615681 10.496136 4.615632 5.841398 6.940979
1 21.126128 7.007890 6.240759 5.392560 5.556378
2 1.002351 0.633254 1.654453 0.385170 3.620671
3 6.404937 6.213862 2.171438 9.474788 5.510254
4 1.615638 0.895569 1.616745 0.816806 2.987322
In [99]:
l.box().iplot(kind='box')
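
A box plot summarizes each column by its median, quartiles, and extremes. To make that concrete, here is a standard-library sketch that computes those figures for the first column (`QOR.YP`) of the five sample rows above. Plotly may use a different quartile interpolation and different whisker rules, so the quartiles here are illustrative rather than the exact values drawn in the chart. The helper name `box_summary` is made up for this example.

```python
import statistics

# Values of the first column (QOR.YP) copied from the sample output above.
values = [7.615681, 21.126128, 1.002351, 6.404937, 1.615638]


def box_summary(data):
    """Return the min, quartiles, median and max that a box plot summarizes."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "min": min(data),
        "q1": q1,
        "median": statistics.median(data),
        "q3": q3,
        "max": max(data),
    }


summary = box_summary(values)
print(summary)
```

With only five rows the quartiles are rough, but the same function works on a full column converted to a list.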
